Many problems are more easily (or possibly) solved without a traditional likelihood
“At its heart, ABC is a very simple method: numerical evaluation of the likelihood function is replaced with an assessment of how likely it is the model could have produced the observed data, based on simulating pseudo-data from the model and comparing it to the observed data.” (Sisson et al. 2018)
\[y \sim Gamma(shape,~rate)\]
gamma_sim <- function(ii, y) {
shape <- truncnorm::rtruncnorm(1, a = 0, mean = 0, sd = 5)
rate <- truncnorm::rtruncnorm(1, a = 0, mean = 0, sd = 5)
log_lik <- sum(dgamma(y, shape, rate, log = TRUE))
return(c(ii = ii, shape = shape, rate = rate, log_lik = log_lik))
}
gamma_sim(1, 1:10) ii shape rate log_lik
1.0000000 2.6818141 0.2662371 -28.9345392
Be careful about how many samples you will take:
samples <- matrix(NA, nrow = 1e5, ncol = 4)
colnames(samples) <- c("ii", "shape", "rate", "log_lik")
format(object.size(samples), units = "Mb")[1] "1.5 Mb"
ABC works best with loops
purrr/furrr functions ii shape rate log_lik
[1,] 1 11.206811 5.781687 -1596.7511
[2,] 2 6.452868 6.694615 -423.7230
[3,] 3 3.522727 7.328538 -994.0511
[4,] 4 6.905452 10.720679 -852.5172
[5,] 5 5.142266 5.288729 -381.0074
[6,] 6 4.010726 0.539865 -2619.3155
# A tibble: 6 × 4
ii shape rate log_lik
<dbl> <dbl> <dbl> <dbl>
1 94107 3.47 3.43 -354.
2 69478 3.46 3.40 -354.
3 13372 3.45 3.41 -354.
4 44377 3.48 3.42 -354.
5 13074 3.41 3.36 -354.
6 57396 3.48 3.41 -354.
# A tibble: 1 × 2
shape rate
<dbl> <dbl>
1 3.50 3.48
gamma_sim_uniform <- function(ii, y) {
shape <- runif(1, 0.1, 10)
rate <- runif(1, 0.1, 10)
log_lik <- sum(dgamma(y, shape, rate, log = TRUE))
return(c(ii = ii, shape = shape, rate = rate, log_lik = log_lik))
}
for (ii in seq_len(1e5)) {
samples[ii, ] <- gamma_sim_uniform(ii, y)
}
samples <- samples |>
as_tibble() |>
arrange(desc(log_lik)) |>
slice(1:5000)
samples |> summarise(across(2:3, ~ median(.x)))# A tibble: 1 × 2
shape rate
<dbl> <dbl>
1 3.89 3.90
\[y(t) = a e^{-b e^{-c t}}\]
while() loopstotal_iterations <- 0
sample_counter <- 1
tic()
while (sample_counter <= nrow(samples)) {
total_iterations <- total_iterations + 1
a <- truncnorm::rtruncnorm(1, a = 0, mean = 5, sd = 2)
b <- rnorm(1, 0, 4)
c <- rexp(1, rate = 5)
predicted <- Gompertz(t = GGsim$t, a, b, c)
MAE <- mean(abs(GGsim$y - predicted))
if (MAE <= min_MAE) {
samples[sample_counter, ] <- c(sample_counter, a, b, c, MAE)
sample_counter <- sample_counter + 1
}
}
toc()0.467 sec elapsed
[1] 44066
ii a b c MAE
[1,] 1 5.104372 2.9192575 0.57257715 0.3769681
[2,] 2 4.860579 0.4039046 0.20875071 0.3054701
[3,] 3 5.041581 0.1805600 0.07335673 0.3854870
[4,] 4 5.556490 1.8963527 0.59041063 0.4885992
[5,] 5 4.787738 4.5050994 0.80856828 0.4753298
[6,] 6 4.526859 1.5379045 0.79077532 0.4348779
[7,] 7 5.530129 1.6338703 0.39480792 0.4797085
[8,] 8 5.140731 1.3861181 0.44430259 0.2226250
[9,] 9 4.490962 1.2919265 0.77094821 0.4559979
[10,] 10 4.864262 1.9311239 0.56906294 0.3441100
# A tibble: 1 × 3
a b c
<dbl> <dbl> <dbl>
1 5.22 0.668 0.243
| MAE | Total Samples | Time (s) |
|---|---|---|
| 1.00 | 11654 | 0.31 |
| 0.75 | 20373 | 0.51 |
| 0.50 | 53788 | 1.25 |
| 0.40 | 105585 | 2.43 |
| 0.25 | 642067 | 14.61 |
| 0.20 | 2282010 | 50.31 |
Pros
Cons